Analysis of Temporal-Based Program Behavior for Improved Instruction Cache Performance

Authors

  • John Kalamatianos
  • Alireza Khalafi
  • David R. Kaeli
  • Waleed Meleis
Abstract

In this paper, we examine temporal-based program interaction in order to improve layout by reducing the probability that program units will conflict in an instruction cache. In that context, we present two profile-guided procedure reordering algorithms. Both techniques use cache line coloring to arrive at a final program layout and target the elimination of first-generation cache conflicts (i.e., conflicts between caller/callee pairs). The first algorithm builds a call graph that records local temporal interaction between procedures. We describe how the call graph is used to guide the placement step and present methods that accelerate cache line coloring by exploring aggressive graph pruning techniques. In the second approach, we capture global temporal program interaction by constructing a Conflict Miss Graph (CMG). The CMG estimates the worst-case number of misses two competing procedures can inflict upon one another and targets the reduction of higher-generation cache conflicts. We use a pruned CMG to guide cache line coloring. Using several C and C++ benchmarks, we show the benefits of letting both types of graphs guide procedure reordering to improve instruction cache hit rates. To contrast the differences between these two forms of temporal interaction, we also develop new characterization streams based on the Inter-Reference Gap (IRG) model.

Index Terms: Instruction caches, program reordering, temporal locality, conflict misses, graph coloring, graph pruning.
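The idea of a weighted call graph combined with cache line "colors" can be illustrated with a small sketch. This is not the paper's algorithm: it only counts caller/callee pairs in a call trace and reports which pairs overlap in a direct-mapped cache under a given layout. All names (`build_call_graph`, `cache_colors`, `conflicting_pairs`) and the layout format are hypothetical, chosen for illustration.

```python
from collections import Counter


def build_call_graph(call_trace):
    """Count how often each unordered procedure pair occurs as (caller, callee).

    Assumption: the trace is a list of (caller, callee) name pairs.
    """
    edges = Counter()
    for caller, callee in call_trace:
        if caller != callee:  # a procedure cannot conflict with itself here
            edges[frozenset((caller, callee))] += 1
    return edges


def cache_colors(start_addr, size, line_size, num_lines):
    """Return the set of cache line indices (colors) a procedure occupies
    in a direct-mapped cache with num_lines lines of line_size bytes."""
    first = start_addr // line_size
    last = (start_addr + size - 1) // line_size
    if last - first + 1 >= num_lines:
        return set(range(num_lines))  # procedure wraps around the whole cache
    return {line % num_lines for line in range(first, last + 1)}


def conflicting_pairs(edges, layout, line_size, num_lines):
    """List call graph edges whose endpoints share a color, heaviest first.

    Assumption: layout maps a procedure name to (start_addr, size_in_bytes).
    """
    colors = {
        name: cache_colors(addr, size, line_size, num_lines)
        for name, (addr, size) in layout.items()
    }
    result = []
    for pair, weight in edges.items():
        a, b = sorted(pair)
        if colors[a] & colors[b]:
            result.append((weight, a, b))
    return sorted(result, reverse=True)


if __name__ == "__main__":
    trace = [("main", "f"), ("main", "g"), ("main", "f"), ("f", "g")]
    layout = {"main": (0, 64), "f": (64, 64), "g": (256, 32)}
    # 8 lines of 32 bytes: a 256-byte direct-mapped cache
    print(conflicting_pairs(build_call_graph(trace), layout, 32, 8))
```

In this toy layout, `g` starts exactly one cache size after `main`, so the two map to the same line. A reordering pass would try to move one of them to a color that its heavily weighted neighbors do not use.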

Similar resources

Code Placement using Temporal Profile Information

Instruction cache performance is important to instruction fetch efficiency and overall processor performance. The layout of an executable has a substantial effect on the cache miss rate and the instruction working set size during execution. This means that the performance of an executable can be improved significantly by applying a code-placement algorithm that minimizes instruction cache confl...

Efficient On-the-fly Analysis of Program Behavior and Static Cache Simulation

The main contributions of this paper are twofold. First, a general framework for control-flow partitioning is presented for efficient on-the-fly analysis, i.e. for program behavior analysis during execution using a small number of instrumentation points. The formal model is further refined for certain analyses by transforming a program's call graph into a function-instance graph. Performance evaluation...

Timing Analysis for Data Caches and Set-Associative Caches

The contributions of this paper are twofold. First, an automatic tool-based approach is described to bound worst-case data cache performance. The given approach works on fully optimized code, performs the analysis over the entire control flow of a program, detects and exploits both spatial and temporal locality within data references, produces results typically within a few seconds, and estimates...

Temporal-Based Procedure Reordering for Improved Instruction Cache Performance

As the gap between memory and processor performance continues to grow, it becomes increasingly important to exploit cache memory effectively. Both hardware and software techniques can be used to better utilize the cache. Hardware solutions focus on organization, while most software solutions investigate how best to lay out a program in the available memory space. In this paper we present a new l...

Utilizing Block Size Variability to Enhance Instruction Fetch Rate

In the past, instruction fetch speeds have been improved by using cache schemes that capture the actual program flow. In this paper, we elaborate on the architecture and operation of an instruction cache named Variable-Sized Block Cache (VSBC) that also makes use of the dynamic behavior of a program. Current trace-based cache schemes usually have some instructions stored repeatedly; this redund...

Journal:
  • IEEE Trans. Computers

Volume 48

Publication date: 1999